Dictionaries and phrase tables are the basis of modern statistical machinetranslation systems. This paper develops a method that can automate the processof generating and extending dictionaries and phrase tables. Our method cantranslate missing word and phrase entries by learning language structures basedon large monolingual data and mapping between languages from small bilingualdata. It uses distributed representation of words and learns a linear mappingbetween vector spaces of languages. Despite its simplicity, our method issurprisingly effective: we can achieve almost 90% precision@5 for translationof words between English and Spanish. This method makes little assumption aboutthe languages, so it can be used to extend and refine dictionaries andtranslation tables for any language pairs.
展开▼